Flexible Harmonic/Stochastic Modeling for HMM-based Speech Synthesis

Authors

Eleftherios Banos

Daniel Erro

Antonio Bonafonte

Asuncion Moreno

Abstract

This paper presents preliminary results of a new approach to speech modeling for statistical parametric HMM-based speech synthesis. The proposed system is based on a flexible pitch-asynchronous harmonic/stochastic model (HSM) [1]. Speech is modeled as the superposition of two components: a harmonic component and a stochastic, or aperiodic, component. Because the synthesis model is pitch-asynchronous, it can be integrated directly into an HMM-based synthesis system. HTS [2], an open-source software toolkit for HMM-based speech synthesis, was used. The proposed HSM method was compared to the HTS baseline system using the same configuration and database. A number of different experiments were conducted. The results show that the synthesized utterances reach high quality. A small perceptual test was carried out comparing the two systems on the quality of the synthetic voice and its similarity to the original voice. HSM outperforms the HTS baseline in the quality test: HSM 53%, HTS 35.3%, and undecided 11.7%. Concerning similarity to the original voice, HSM performed slightly better than HTS: HSM 35.3%, HTS 29.4%, and undecided 35.3%.
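The superposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' HSM implementation: the function names, the fixed frame length, and the use of unfiltered white Gaussian noise for the stochastic part are all assumptions made here for illustration (a real system would typically shape the noise spectrally and interpolate parameters across frames).

```python
import math
import random


def harmonic_component(f0, amplitudes, phases, sample_rate, num_samples):
    """Sum of sinusoids at integer multiples of f0 (hypothetical helper).

    Harmonic k (starting at 1) has frequency k * f0, amplitude amplitudes[k-1]
    and initial phase phases[k-1].
    """
    frame = []
    for n in range(num_samples):
        t = n / sample_rate
        sample = sum(
            a * math.cos(2.0 * math.pi * k * f0 * t + p)
            for k, (a, p) in enumerate(zip(amplitudes, phases), start=1)
        )
        frame.append(sample)
    return frame


def stochastic_component(num_samples, gain, seed=0):
    """White Gaussian noise scaled by gain (assumption: no spectral shaping)."""
    rng = random.Random(seed)
    return [gain * rng.gauss(0.0, 1.0) for _ in range(num_samples)]


def synthesize_frame(f0, amplitudes, phases, noise_gain,
                     sample_rate=16000, num_samples=160, seed=0):
    """Superpose the harmonic and stochastic components sample by sample."""
    harmonic = harmonic_component(f0, amplitudes, phases, sample_rate, num_samples)
    noise = stochastic_component(num_samples, noise_gain, seed)
    return [h + s for h, s in zip(harmonic, noise)]


if __name__ == "__main__":
    # A voiced-like frame: two harmonics of 100 Hz plus a little noise.
    frame = synthesize_frame(100.0, [1.0, 0.5], [0.0, 0.0], noise_gain=0.05)
    print(len(frame))
```

Because the frame length here is fixed rather than tied to the pitch period, the sketch is pitch-asynchronous in the loose sense used above: every frame has the same size regardless of `f0`, which is the property that makes frame-by-frame parameter streams straightforward to feed into an HMM-based system.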


Similar resources

Using F0 Contour Generation Process Model for Improved and Flexible Control of Prosodic Features in HMM-based Speech Synthesis

The generation process model of fundamental frequency contours, known as Fujisaki's model, is well suited to representing global features of prosody. It is a command-response model, where the commands have clear relations with the linguistic and para-/non-linguistic information included in the utterance. Therefore, by controlling fundamental frequency contours in the framework of the generation process model, a mo...


Estimation of resonant characteristics based on AR-HMM modeling and spectral envelope conversion of vowel sounds

A new method was developed for accurately separating source and articulation filter characteristics of speech. This method is based on the AR-HMM modeling, where the residual waveform is expressed as the output sequence from an HMM. To realize an accurate analysis, a scheme of dividing HMM state was newly introduced. Using the AR-filter parameter values obtained through the analysis, we can con...


Flexible harmonic/stochastic speech synthesis

In this paper, our flexible harmonic/stochastic waveform generator for a speech synthesis system is presented. The speech is modeled as the superposition of two components: a harmonic component and a stochastic or aperiodic component. The purpose of this representation is to provide a framework with maximum flexibility for all kind of speech transformations. In contrast to other similar systems...


Superpositional Modeling of Fundamental Frequency Contours for HMM-based Speech Synthesis

Statistical parametric speech synthesis technologies, such as HMM-based and DNN-based ones, gain special attention from researchers because of their ability in generating speech in various voice qualities and styles. In these methods, all acoustic parameters (except durational ones) are handled in a frame-by-frame manner, which is not appropriate for prosodic features. Although relation of adja...


A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis

Speech generated by parametric synthesizers generally suffers from a typical buzziness, similar to what was encountered in old LPC-like vocoders. In order to alleviate this problem, a more suited modeling of the excitation should be adopted. For this, we hereby propose an adaptation of the Deterministic plus Stochastic Model (DSM) for the residual. In this model, the excitation is divided into ...




Publication date: 2008
